Credit Assignment Through Broadcasting a Global Error Vector
Backpropagation (BP) uses detailed, unit-specific feedback to train deep neural networks (DNNs) with remarkable success. The fact that biological neural circuits appear to perform credit assignment, yet cannot implement BP, implies the existence of other powerful learning algorithms. Here, we explore the extent to which a globally broadcast learning signal, coupled with local weight updates, enables training of DNNs. We present both a learning rule, called global error-vector broadcasting (GEVB), and a class of DNNs, called vectorized nonnegative networks (VNNs), in which this learning rule operates. VNNs have vector-valued units and nonnegative weights past the first layer. The GEVB learning rule generalizes three-factor Hebbian learning, updating each weight by an amount proportional to the inner product of the presynaptic activation and a globally broadcast error vector whenever the postsynaptic unit is active. We prove that these weight updates are matched in sign to the gradient, enabling accurate credit assignment. Moreover, at initialization, these updates are exactly proportional to the gradient in the limit of infinite network width. GEVB matches the performance of BP in VNNs, and in some cases outperforms direct feedback alignment (DFA) applied in conventional networks.
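As a concrete illustration of the rule described in this abstract, the following is a minimal NumPy sketch of one GEVB-style update. The function name, array shapes, and the use of simple clipping to enforce nonnegativity are our own assumptions for illustration, not code from the paper.

```python
import numpy as np

def gevb_update(W, pre_act, post_active, error_vec, lr=1e-2):
    """One GEVB-style weight update (illustrative sketch, not the paper's code).

    W           : (n_post, n_pre) nonnegative weight matrix
    pre_act     : (n_pre, d) vector-valued presynaptic activations
    post_active : (n_post,) boolean, True where the postsynaptic unit is active
    error_vec   : (d,) globally broadcast error vector
    """
    # Inner product of each presynaptic activation with the global error
    # vector: the same broadcast signal reaches every synapse.
    local_term = pre_act @ error_vec                    # shape (n_pre,)
    # Gate by postsynaptic activity (three-factor Hebbian form):
    # only active postsynaptic units update their incoming weights.
    dW = np.outer(post_active.astype(float), local_term)
    # Assumption: keep weights past the first layer nonnegative via clipping.
    return np.maximum(W + lr * dW, 0.0)
```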
Neuronal Competition Groups with Supervised STDP for Spike-Based Classification
Spike Timing-Dependent Plasticity (STDP) is a promising substitute for backpropagation for local training of Spiking Neural Networks (SNNs) on neuromorphic hardware. STDP allows SNNs to address classification tasks by combining unsupervised STDP for feature extraction with supervised STDP for classification. Unsupervised STDP is usually employed with Winner-Takes-All (WTA) competition to learn distinct patterns.
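To make the WTA-gated STDP idea concrete, here is a minimal pair-based sketch: only the winning neuron of the competition updates its weights. The function name, the trace-free pair-based form, and the weight bounds are our assumptions; practical SNN implementations typically use spike traces and hardware-specific update rules.

```python
import numpy as np

def stdp_wta_update(W, pre_times, post_times, winner,
                    a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP applied only to the WTA winner (illustrative sketch).

    W          : (n_post, n_pre) weight matrix
    pre_times  : (n_pre,) last presynaptic spike times in ms (np.nan if silent)
    post_times : (n_post,) last postsynaptic spike times in ms
    winner     : index of the neuron that won the WTA competition
    """
    dt = post_times[winner] - pre_times                # (n_pre,)
    dw = np.where(dt >= 0,
                  a_plus * np.exp(-dt / tau),          # pre before post: potentiate
                  -a_minus * np.exp(dt / tau))         # post before pre: depress
    dw = np.where(np.isnan(dt), 0.0, dw)               # silent inputs: no change
    W[winner] = np.clip(W[winner] + dw, 0.0, 1.0)      # keep weights bounded
    return W
```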
Communication-efficient Distributed SGD with Sketching
However, theoretical and empirical evidence both suggest that there is a maximum mini-batch size beyond which the number of iterations required to converge stops decreasing, and generalization error begins to increase [Ma et al., 2017, Li et al., 2014, Golmant et al., 2018, Shallue et al., 2018, Keskar et al., 2016, Hoffer et al., 2017]. In this paper, we aim instead to decrease the communication cost per worker.
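A minimal sketch of the core idea follows: each worker compresses its gradient into a small Count Sketch, and because sketching is linear, the server can sum the sketches and decompress once to approximate the aggregate gradient. The function names and sizes are our own, and this omits pieces the paper relies on, such as top-k heavy-hitter recovery and error feedback.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sketch(dim, rows=5, cols=256):
    """Random hash buckets and signs defining a Count Sketch (shared by all workers)."""
    bucket = rng.integers(0, cols, size=(rows, dim))
    sign = rng.choice([-1.0, 1.0], size=(rows, dim))
    return bucket, sign, cols

def compress(grad, bucket, sign, cols):
    """Sketch a gradient: transmit rows * cols values instead of dim values."""
    S = np.zeros((bucket.shape[0], cols))
    for r in range(bucket.shape[0]):
        np.add.at(S[r], bucket[r], sign[r] * grad)
    return S

def decompress(S, bucket, sign):
    """Count Sketch recovery: median of the per-row unbiased estimates."""
    est = np.stack([sign[r] * S[r, bucket[r]] for r in range(bucket.shape[0])])
    return np.median(est, axis=0)

# Workers sketch local (here, synthetic sparse) gradients; the server sums
# the small sketches and unsketches once to approximate the full gradient.
dim = 10_000
bucket, sign, cols = make_sketch(dim)
worker_grads = [rng.normal(size=dim) * (rng.random(dim) < 0.01) for _ in range(4)]
server_sketch = sum(compress(g, bucket, sign, cols) for g in worker_grads)
approx_grad = decompress(server_sketch, bucket, sign)
```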
A Distinguishing supervised learning from reinforcement learning in a feedforward model
In order to illustrate the main idea of our paper in a simplified context, we show in this section how observed hidden-layer activity in a linear feedforward network can be used to infer the learning rule used to train the network. Consider the simple feedforward network shown in Fig. S1: inputs taking values in {-1, 1}, for t = 1, ..., T, are projected onto a hidden layer h, and N(0, Σ) noise is injected into the network. This setup is similar to learning with Feedback Alignment [4], except that here we do not assume that the readout weights are being learned. Equations (11) and (13) provide predictions for how the hidden-layer activity is expected to evolve under either SL or RL.
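A minimal sketch of the two update regimes the appendix contrasts is given below. The variable names (x, h, B, sigma) and the specific forms of the rules (a fixed-feedback-weight supervised rule, and a node-perturbation reward-modulated rule for RL) are our assumptions based on the description above, not the paper's equations (11) and (13).

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, sigma, lr = 20, 50, 0.1, 1e-3

W = rng.normal(size=(n_hid, n_in)) / np.sqrt(n_in)   # learned input-to-hidden weights
w_out = rng.normal(size=n_hid) / np.sqrt(n_hid)      # fixed (unlearned) readout
B = rng.normal(size=n_hid)                            # fixed feedback weights (FA-style)
r_bar = 0.0                                           # running reward baseline

def sl_step(x, y_target):
    """Supervised rule: readout error broadcast through fixed feedback weights B."""
    global W
    h = W @ x + sigma * rng.normal(size=n_hid)        # hidden activity with injected noise
    err = y_target - w_out @ h
    W += lr * np.outer(B * err, x)
    return h

def rl_step(x, y_target):
    """Reward-modulated rule: scalar reward correlates with the injected noise."""
    global W, r_bar
    xi = sigma * rng.normal(size=n_hid)               # the injected noise itself
    h = W @ x + xi
    r = -(y_target - w_out @ h) ** 2                  # scalar reward (negative squared error)
    W += lr * (r - r_bar) * np.outer(xi, x) / sigma**2
    r_bar = 0.9 * r_bar + 0.1 * r                     # update the baseline
    return h
```

Recording h over trials from each rule and comparing its evolution is the kind of observation Equations (11) and (13) make predictions about.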